Production of a single top quark provides an excellent opportunity for understanding top quark
physics and the Cabibbo-Kobayashi-Maskawa structure of the quark sector in the Standard Model.
Although associated production with a $b$-quark was already observed at the Tevatron in
2009, single top production in association with a $W$ gauge boson was not observed until 2014
at the LHC, where pair production of the top quark serves as the dominant background. Due to the
kinematic similarity between $tW$ and the dominant background, it is challenging to find suitable
kinematic variables that offer good signal-background separation, which naturally leads to the use
of multivariate methods. In this paper, we investigate the kinematic structure of the $tW + j$ channel using
the $M_{T2}$ and invariant mass variables, and find that $tW + j$ production can be well separated from $t\bar{t}$
production with high purity at a low cost in statistics when utilizing these kinematic correlations.
I. INTRODUCTION

The research program at the Large Hadron Collider (LHC) has been greatly successful
in the sense that it has not only discovered a new scalar state [1, 2], which is
consistent with the Higgs boson in the Standard Model (SM), but also rediscovered the
SM with great precision. Among the precision studies, the top quark ($t$) has received
particular attention, both in its own right and as a window to new physics discovery.
In fact, the LHC, dubbed a “top factory”, is capable of copiously producing top quarks
in pairs via the strong interaction. Although mediated by the electroweak interaction,
the production rate of a single top quark is quite sizable due to the large
center-of-mass energy, so that the LHC provides an ideal environment to study the
single top modes as well. In the SM, the relevant production cross section of a single
top is directly proportional to the square of one of the Cabibbo-Kobayashi-Maskawa
(CKM) matrix elements, $V_{tb}$, so that single top channels serve as a way to measure
this parameter. On top of this parameter measurement, their cross section measurement
is also sensitive to various new phenomena such as fourth-generation models and models
with flavor-changing neutral currents [3].

The production of a single top through $s$-channel and $t$-channel $W$ gauge boson
exchanges was observed, at the 5.0 standard deviation level of significance, separately
by D0 [4] and by CDF [5], whereas the associated production of a single top with a $W$
gauge boson (henceforth denoted by $tW$) had too small a cross section to be observed
at the Tevatron. Nevertheless, the discovery of the $tW$ channel is of great importance
as 1) a way of confirming the SM in the top sector, 2) a cross-check of the $|V_{tb}|$
measurement, and 3) a possible link to new physics searches such as bottom
partners [6, 7]. The LHC experiments were able to reach a sufficient production cross
section to see the $tW$ mode [8, 9] only five years after the discovery of the
$s$-channel and $t$-channel single top modes, and the combination of their cross
section measurements can be found in Ref. [10]. The ATLAS and CMS collaborations have
devoted a lot of effort to developing a variety of sophisticated multivariate
techniques that take advantage of the differences in the kinematic distributions
between the relevant signal and backgrounds, i.e., the method of Boosted Decision
Trees (BDT) for CMS and the method of Multi-Variate Analysis (MVA) for ATLAS. Yet,
there is no single kinematic variable that provides reasonable separation between the
signal and backgrounds.

The signal channel is defined by the process shown in the left panel of Figure 1,
while the major background to this channel is identified as ordinary pair-produced
top quarks for which one of the bottom quarks is missed (typically due to the
transverse momentum and pseudo-rapidity acceptance). Although the corresponding
probability may not be large, the overwhelming production rate of $t\bar{t}$ can give
rise to a sizable background to the signal process. This expectation is clearly
reflected in the CMS analysis of Ref. [10]. Their signal region is defined by exactly
one $b$-tagged jet (together with two $\ell$'s). Although the signal region
predominantly contains $tW$ and $t\bar{t}$ events (after their selection criteria),
$t\bar{t}$ is still $\sim 5$ times larger than $tW$, again motivating the adoption of
multivariate techniques as an a posteriori data analysis scheme.

It is interesting to compare the kinematic features of the $tW$ and the $t\bar{t}$
systems. First of all, the bottom quark comes from the decay of a top quark along with
a $W$ gauge boson for both signal and background processes, i.e., the typical hardness
and the directional preference of the bottom quark are similar. Therefore, its
kinematic distributions for $tW$ and $t\bar{t}$, e.g., the distribution in the
transverse momentum, tend to be close to each other. An analogous argument is readily
applicable to the lepton. For both signal and background, it is emitted from a $W$
boson along with a neutrino, so that the typical hardness and the directional
preference are anticipated to be similar. In line with this observation, it is not
surprising that other variables built from the momenta of $b$-quarks and leptons do
not show reasonable performance in separating the signal and background events. In
other words, it is rather difficult to find suitable kinematic variables that offer
good signal-background discrimination.

FIG. 1: A sample Feynman diagram of the associated production of a single top with a
$W$ gauge boson and their subsequent decay (left panel) and one with an extra jet
attached (right panel).

Given such a challenging situation, we here propose an alternative kinematic
variable-based strategy which could have expedited the observation of the single top
mode associated with a $W$ gauge boson. The main idea behind our proposal can be
summarized as follows. We basically require an additional jet on top of a
bottom-tagged jet, two opposite-sign leptons, and a (large) missing transverse energy
in the final state. Such an extra jet can be either $b$-tagged or not, i.e.,
$2b + \ell^+\ell^- + \not\!\!E_T$ or $1b + 1j + \ell^+\ell^- + \not\!\!E_T$,
correspondingly. For the latter signal region, we carry out exactly the same analysis
as for the former, i.e., we treat the additional non-$b$-tagged jet as if it were a
bottom-initiated jet. With this requirement, the background restores the regular
dileptonic $t\bar{t}$ event topology.$^1$ On the other hand, the signal process comes
with a single $b$-quark at the leading order, so that higher order contributions are
essential to meet the requirement, i.e., an extra jet must be attached to the leading
order process. An example diagram is illustrated in the right panel of Figure 1.
Unlike the background, the $tW$ with an additional jet has an ill-defined event
topology because such a jet typically comes from either initial state radiation (ISR)
or final state radiation (FSR). We then apply the well-known $M_{T2}$ variable [11-14]
and the conventional invariant mass variable formed by a bottom quark and a lepton,
$m_{b\ell}$. While the background yields upper-bounded distributions in those
variables, the signal distributions are expected to stretch beyond the kinematic
endpoints of the background, with the details dictated by the hardness of the extra
jet. It is therefore expected that a large fraction of signal events survive
kinematic cuts in $M_{T2}$ and $m_{b\ell}$ while the background events are
significantly suppressed. A related approach has been examined in Ref. [15] to solve
combinatorial issues with ISR in new physics signals involving jets. Our approach is
different, and with our findings we suggest using ISR to suppress backgrounds in the
given final state for an expedited discovery and precision measurement.

$^1$ Of course, one of the two $b$-jets can be either $b$-tagged or not as well.

FIG. 2: The dileptonic $t\bar{t}$ decay process with the corresponding symmetric
subsystems explicitly specified. The blue dotted, green dot-dashed, and black solid
boxes indicate subsystems $(bb)$, $(\ell\ell)$, and $(b\ell b\ell)$, respectively.

The rest of this paper is organized as follows. In the next section, we briefly
review the $M_{T2}$ variable, taking the dileptonic $t\bar{t}$ as a concrete example.
In Sec. III, we discuss the behaviors of $t\bar{t}$ and $tW$ in the $M_{T2}$ and
$m_{b\ell}$ variables with the requirement of $1b + 2\ell + \not\!\!E_T$. We then
re-examine their behaviors in those variables with an additional jet requirement in
Sec. IV. Sec. V is reserved for our discussions and outlook.


II. A REVIEW OF THE $M_{T2}$ VARIABLE

The $M_{T2}$ and $m_{b\ell}$ variables are well-motivated especially for a cascade
decay of a heavy particle including two-step two-body decays such as the top decay,
and therefore, it makes sense to investigate them for the $tW$ case. While
$m_{b\ell}$ is (relatively) well-known, the $M_{T2}$ variable has non-trivial and less
familiar features. We therefore provide a brief review of the $M_{T2}$ variable that
is employed for the analyses in the following sections. For concreteness of the
discussion later, we take the event topology defined by the pair-produced top quarks
which subsequently decay dileptonically (see also Figure 2):

$$t\bar{t} \to b W^+ \bar{b} W^- \to b \ell^+ \nu \, \bar{b} \ell^- \bar{\nu}. \qquad (1)$$

We also take the decay sequence initiated by the top quark as the first decay side,
and that by the anti-top quark as the second decay side, solely for convenience.

The $M_{T2}$ variable was originally proposed as a simple generalization of the
well-known transverse mass to the case where each of the pair-produced heavier
particles decays into an invisible particle along with a visible state [11-14]. Since
the total missing transverse momentum is shared by the two invisible particles, its
formal definition is given by a minimization of the maximum of the two transverse
masses ($M_T^{(1)}$ and $M_T^{(2)}$) in each decay chain over the transverse
components of the invisible momenta (denoted by $\mathbf{q}_T^{(1)}$ and
$\mathbf{q}_T^{(2)}$) subject to the missing transverse momentum constraint, i.e.,
the total sum of the transverse momenta should identically vanish:

$$M_{T2}(\tilde{m}) \equiv \min_{\mathbf{q}_T^{(1)} + \mathbf{q}_T^{(2)} = \not\mathbf{P}_T} \left[ \max\left\{ M_T^{(1)}(\mathbf{q}_T^{(1)}; \tilde{m}),\; M_T^{(2)}(\mathbf{q}_T^{(2)}; \tilde{m}) \right\} \right], \qquad (2)$$

where $\not\!\mathbf{P}_T$ is the total missing transverse momentum, $\tilde{m}$
denotes the hypothetical/test mass parameter for the invisible particles, and the
superscripted numbers indicate the associated decay side. When more than one visible
particle is involved in each decay chain, one can define $M_{T2}$ in various
subsystems [13], which can be further categorized into symmetric and asymmetric
subsystems according to whether or not both $M_T^{(i)}$'s ($i = 1, 2$) are constructed
in the same fashion. For the case of the $t\bar{t}$ system, there are three symmetric
subsystems which are henceforth denoted by the $(bb)$, $(\ell\ell)$, and
$(b\ell b\ell)$ subsystems as per the visible particles associated with the subsystem
under consideration. We explicitly delineate those three subsystems in Figure 2, and
the operational difference among them is summarized below:



• For the $(b\ell b\ell)$ subsystem, the transverse masses for the top quarks are
minimized with the neutrinos considered as invisible particles.

• For the $(\ell\ell)$ subsystem, the transverse masses for the $W$ bosons are
minimized with the neutrinos considered as invisible particles. The visible momenta
of the bottom quarks are considered as upstream momenta.

• For the $(bb)$ subsystem, the transverse masses for the top quarks are minimized
with the $W$ bosons considered as invisible particles. The visible momenta of the
leptons are considered as downstream momenta so that they are treated invisibly.

Since the neutrino plays the role of the invisible particle in the $(b\ell b\ell)$
and $(\ell\ell)$ subsystems, the relevant test mass is typically assumed to be 0 GeV
as per the SM neutrino mass. Analogously, for the $(bb)$ subsystem, the relevant test
mass is typically assumed to be 80 GeV as per the mass of the $W$ gauge boson.

Similar constructions can be performed for the asymmetric subsystems [16]. In this
case, there arise three different subsystems denoted by $(b\ell\ell)$, $(b\ell b)$,
and $(b\ell)$, again named after the visible particles associated with the subsystem
of interest. The corresponding subsystems are explicitly delineated in Figure 3, and
the operational difference among them is explained below:

• For the $(b\ell\ell)$ subsystem, the transverse masses for the top quark in one
decay side and the $W$ boson in the other decay side are minimized with the neutrinos
considered as invisible particles. The visible momentum of the remaining bottom quark
is considered as upstream momentum.

• For the $(b\ell b)$ subsystem, the transverse masses for the top quarks are
minimized with the neutrino in one decay side and the $W$ boson in the other decay
side considered as invisible particles. The visible momentum of the remaining lepton
is considered as downstream momentum so that it is treated invisibly.

• For the $(b\ell)$ subsystem, the transverse masses for the top quark in one decay
side and the $W$ boson in the other decay side are minimized with the neutrino in one
decay side and the $W$ boson in the other decay side considered as invisible
particles. The visible momenta of the remaining bottom quark and lepton are
considered as upstream and downstream momenta, respectively, the latter of which is
treated invisibly.

Since the neutrino is considered as the invisible particle in both decay sides for
the $(b\ell\ell)$ subsystem, the relevant test mass is typically assumed to be 0 GeV
as per the SM neutrino mass. By contrast, in the other two subsystems, two different
particle species take over the role of invisible particles so that two different test
masses can be imposed accordingly, i.e., 0 GeV and 80 GeV as per the masses of the SM
neutrino and the $W$ gauge boson, depending on the subsystem of interest.

FIG. 3: The dileptonic $t\bar{t}$ decay process with the corresponding asymmetric
subsystems explicitly specified. The blue dotted and red solid boxes in the left
panel and the green dot-dashed box in the right panel indicate subsystems
$(b\ell\ell)$, $(b\ell b)$, and $(b\ell)$, respectively.
One noteworthy fact is that the associated $M_{T2}$ distributions are bounded above
by the mass of the decaying particle.$^2$ In fact, the analytic expressions for the
kinematic endpoints can be written in terms of the mass parameters involved in the
decay process [11-14], and interestingly enough, if the test masses are the same as
the masses of the invisible particles in the relevant subsystem, the maximum $M_{T2}$
value is the same as the heavier of the actual masses of the particles whose
transverse masses are minimized. For our $t\bar{t}$ example, subsystems
$(b\ell b\ell)$, $(bb)$, $(b\ell\ell)$, $(b\ell b)$, and $(b\ell)$ simply return the
top quark mass while subsystem $(\ell\ell)$ simply returns the $W$ mass if each of
the test masses is imposed correspondingly.
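The min-max definition of $M_{T2}$ reviewed in this section can be illustrated numerically. The Python sketch below is not the algorithm used for the results in this paper; it is a brute-force grid scan written only for illustration. The function names (`transverse_mass`, `mt2_scan`) and the scan parameters (`half_width`, `n_side`, `n_refine`) are hypothetical, and the sketch assumes that the two invisible transverse momenta sum to the measured missing transverse momentum.

```python
import math

def transverse_mass(vis_pt, vis_mass, inv_pt, inv_mass):
    """Transverse mass of one decay side (all quantities in GeV).

    vis_pt and inv_pt are (px, py) tuples for the visible system and the
    hypothesized invisible particle; inv_mass is the test mass.
    """
    et_vis = math.sqrt(vis_mass**2 + vis_pt[0]**2 + vis_pt[1]**2)
    et_inv = math.sqrt(inv_mass**2 + inv_pt[0]**2 + inv_pt[1]**2)
    dot = vis_pt[0] * inv_pt[0] + vis_pt[1] * inv_pt[1]
    mt_sq = vis_mass**2 + inv_mass**2 + 2.0 * (et_vis * et_inv - dot)
    return math.sqrt(max(mt_sq, 0.0))

def mt2_scan(vis1, m_vis1, vis2, m_vis2, met,
             test_mass1=0.0, test_mass2=0.0,
             half_width=500.0, n_side=20, n_refine=8):
    """Approximate M_T2 by scanning how `met` is split between the two sides.

    The scan starts on a coarse grid centred on met/2 and repeatedly zooms in
    around the best point found so far. Two test masses are accepted so that
    asymmetric subsystems (e.g. 0 GeV and 80 GeV) can be handled as well.
    """
    def objective(qx, qy):
        q1 = (qx, qy)
        q2 = (met[0] - qx, met[1] - qy)  # constraint: q1 + q2 = met
        return max(transverse_mass(vis1, m_vis1, q1, test_mass1),
                   transverse_mass(vis2, m_vis2, q2, test_mass2))

    cx, cy = 0.5 * met[0], 0.5 * met[1]
    best = objective(cx, cy)
    width = half_width
    for _ in range(n_refine):
        step = width / n_side
        bx, by = cx, cy
        for i in range(-n_side, n_side + 1):
            for j in range(-n_side, n_side + 1):
                qx, qy = cx + i * step, cy + j * step
                value = objective(qx, qy)
                if value < best:
                    best, bx, by = value, qx, qy
        cx, cy = bx, by
        width *= 0.25  # zoom in around the current best point
    return best
```

For instance, `mt2_scan((45.0, 10.0), 0.0, (-30.0, 25.0), 0.0, met=(-20.0, -40.0))` would evaluate the variable for two massless leptons with a zero test mass, as in the $(\ell\ell)$ subsystem. A grid scan is slow and only approximate; it is chosen here because it follows the definition literally.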


III. tW AT THE LEADING ORDER: EXISTING 
ANALYSES 

We first discuss collider signatures of the dileptonic $tW$ channel at the leading
order together with a brief review of the corresponding experimental measurements
conducted by the CMS and ATLAS collaborations [8, 9]. For more concrete discussions
later on, Monte Carlo event samples of $t\bar{t}$ and $tW$ including realistic
effects such as detector resolutions have been prepared. For both signal ($tW$) and
background ($t\bar{t}$), the parton level events at the leading order are generated
by MadGraph_aMC@NLO [17] in conjunction with the parton distribution functions given
by NNPDF23 [18], which is the default of MadGraph_aMC@NLO. Both the top quark and the
$W$ boson are forced to decay inside MadGraph_aMC@NLO to include the spin-correlation
and off-shell effects. The outcomes ($t\bar{t}$ and $tW$ events) from the parton
event generator are subsequently fed to Pythia6.4 [19] for showering and
hadronization. Those events are then passed to Delphes3 [20] to describe the detector
effects. All the simulation is done for a proton-proton collider with
$\sqrt{s} = 8$ TeV and an input top mass of 173 GeV. Note that here we do not
simulate signal and background processes with extra radiation (e.g.,
$t\bar{t} + j$) at the generation level, as the showering by the Pythia module can
effectively take care of the relevant diagrams [21-23].

$^2$ Strictly speaking, this statement is true only if the actual event comes from a
well-defined decay topology. We will see that this is not the case for our signal
process, i.e., $tW + j$ from an ill-defined decay topology.

Given the final state defined by the dileptonic $tW$ at the leading order, i.e.,
$b\ell^+\ell^- + \not\!\!E_T$ with $\ell$ being either $e$ or $\mu$, several SM
processes can give rise to the same visible final state. It turns out that among them
dileptonic $t\bar{t}$ is the dominant background where one of the $b$-quarks is lost,
and therefore, we focus on the comparison between the two processes throughout this
paper. To be mostly left with $tW$ and $t\bar{t}$ events, we closely follow the event
selection scheme employed in Ref. [8], among which the key criteria are enumerated
below:


• $N_\ell = 2$ with opposite electric charges, $p_T^{\ell} > 10$ GeV and $|\eta^{e(\mu)}| < 2.5\ (2.4)$, (3)

• $\not\!\!E_T > 50$ GeV for the same flavor channels, (4)

• $m_{\ell\ell} > 20$ GeV and $|m_{\ell\ell} - m_Z| > 10$ GeV, (5)

• $N_j = 0$ while $N_b = 1$, $p_T^{j(b)} > 20\ (30)$ GeV and $|\eta^{j(b)}| < 4.9\ (2.4)$, (6)

where $N_\ell$ and $N_j$ ($N_b$) denote the number of selected leptons and jets
($b$-tagged jets), respectively, and $\not\!\!E_T$ is defined as
$\left|\sum_i \vec{p}_{T,i}\right| = \left|-\vec{\not\!P}_T\right|$ with $i$ running
over all detected particle species. Jets are formed by the anti-$k_t$ algorithm [21]
together with a radius parameter $R = 0.5$, and the $b$-tagging efficiency is
hardwired to be 70%, while the light quark jets are mis-tagged at a rate of
1% [8].$^3$ A jet is tagged as a $b$-jet if its direction lies in the acceptance of
the tracker and if it is associated with a parent $b$-quark [20].
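The selection in Eqs. (3) through (6) can be sketched as a simple event filter. The Python code below is only an illustration, not the analysis code behind this paper: the container types (`Lepton`, `Jet`), the function names, and the value of `M_Z` are assumptions introduced here, and the thresholds are a transcription of Eqs. (3) through (6), reading 2.5 (2.4) as the electron (muon) pseudo-rapidity limit and 20 (30) GeV, 4.9 (2.4) as the limits for ordinary ($b$-tagged) jets.

```python
import math
from collections import namedtuple

M_Z = 91.19  # GeV; assumed value of the Z boson mass

# Hypothetical minimal event content used only for this sketch.
Lepton = namedtuple("Lepton", "flavor charge pt eta phi")  # flavor: "e" or "mu"
Jet = namedtuple("Jet", "pt eta b_tagged")

def dilepton_mass(l1, l2):
    """Invariant mass of two leptons treated as massless."""
    m_sq = 2.0 * l1.pt * l2.pt * (math.cosh(l1.eta - l2.eta)
                                  - math.cos(l1.phi - l2.phi))
    return math.sqrt(max(m_sq, 0.0))

def passes_selection(leptons, jets, met):
    """Return True if the event satisfies the criteria of Eqs. (3)-(6)."""
    max_eta = {"e": 2.5, "mu": 2.4}
    good = [l for l in leptons
            if l.pt > 10.0 and abs(l.eta) < max_eta[l.flavor]]
    # Eq. (3): exactly two selected leptons with opposite charges
    if len(good) != 2 or good[0].charge * good[1].charge >= 0:
        return False
    # Eq. (4): missing transverse energy cut for same-flavor channels
    if good[0].flavor == good[1].flavor and met <= 50.0:
        return False
    # Eq. (5): low-mass and Z-window vetoes on the dilepton invariant mass
    m_ll = dilepton_mass(good[0], good[1])
    if m_ll <= 20.0 or abs(m_ll - M_Z) <= 10.0:
        return False
    # Eq. (6): exactly one b-tagged jet and no additional ordinary jet
    n_b = sum(1 for j in jets
              if j.b_tagged and j.pt > 30.0 and abs(j.eta) < 2.4)
    n_j = sum(1 for j in jets
              if not j.b_tagged and j.pt > 20.0 and abs(j.eta) < 4.9)
    return n_b == 1 and n_j == 0
```

Requiring one more jet, as done for the signal regions with an extra jet, would amount to changing only the last condition.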

With the events passing the above selection cuts, we first show that conventional
kinematic variables such as $M_{T2}$ for the three available subsystems and
$m_{b\ell}$ would not help us separate the $tW$ events from the $t\bar{t}$ ones. The
relevant distributions are exhibited in the upper-left panel ($M_{T2}$ in the
$(\ell\ell)$ subsystem), the upper-right panel ($M_{T2}$ in the $(b\ell\ell)$
subsystem), the lower-left panel ($M_{T2}$ in the $(b\ell)$ subsystem), and the
lower-right panel ($m_{b\ell}$) of Figure 4. Turning to the $M_{T2}$ variables in the
various subsystems, we see that both $t\bar{t}$ (blue dashed histograms) and $tW$
(red solid histograms) develop similar distributions in them. For the case of
$t\bar{t}$, the distribution in each subsystem is nothing but the one anticipated in
the respective subsystem, and therefore, the associated kinematic endpoint is
expected to be the same as the $W$ gauge boson mass ($M_{T2}$ of subsystem
$(\ell\ell)$) or the top quark mass ($M_{T2}$ of subsystems $(b\ell\ell)$ and
$(b\ell)$) with test masses imposed correspondingly as mentioned earlier [13]. The
theoretical endpoints are indicated by black dashed lines, and we see that most of
the $t\bar{t}$ events are populated below them as expected. The small overflow in the
$M_{T2}$ distributions for the $(\ell\ell)$ and $(b\ell\ell)$ subsystems is due to
various sources such as mis-measurement of $\not\!\!E_T$ and parton
showering/fragmentation (see, for example, Ref. [26] for a more systematic study of
the effect of those sources). On the other hand, for the $M_{T2}$ distribution in the
$(b\ell)$ subsystem, it is hard to find kinematic configurations corresponding to the
relevant endpoint so that the distribution does not reach the expected endpoint. When
it comes to the signal process, in some sense, the final state of $tW$ does not
differ from that of $t\bar{t}$. Taking the $(\ell\ell)$ subsystem as an example,
while the $M_{T2}$ for $tW$ can be interpreted as the one applied to the situation
where $W$ gauge bosons are pair-produced with a non-zero transverse upstream momentum
given by a bottom quark, the net upstream momentum for $t\bar{t}$ is defined by a
vector sum of the transverse momenta of two bottom quarks. Therefore, both
distributions are expected to be upper-bounded by the same endpoint as well as to
develop similar shapes up to the details of upstream momenta. A similar analogy is
relevant to $M_{T2}$ for the other two subsystems. In this case, however, the
$t\bar{t}$ is interpreted as a single top production associated with a $W$ gauge
boson with a missing $b$-jet absorbed into the upstream momentum. Again, signal and
background distributions are expected to be bounded above by the same endpoint, and
are inclined to exhibit similar shapes up to the details of upstream momenta. From
all these observations, we conclude that the $M_{T2}$'s in various subsystems are not
good signal-background discriminators.

$^3$ In Ref. [25], the CMS collaboration observed similar tagging efficiency and
mis-tag rate in events from multijet and top-quark pair productions.

FIG. 4: $M_{T2}$ distributions in various subsystems - subsystem $(\ell\ell)$
(upper-left panel), subsystem $(b\ell\ell)$ (upper-right panel), and subsystem
$(b\ell)$ (lower-left panel) - and the $m_{b\ell}$ distributions (lower-right panel)
for $t\bar{t}$ and $tW$ events. The distributions are plotted with the events passing
the selection criteria listed in Eqs. (3) through (6). The combinatorics arising in
$M_{T2}$ for subsystems $(b\ell\ell)$ and $(b\ell)$ and in $m_{b\ell}$ is treated by
choosing the smaller of the two possible values in each variable. The test mass for
$M_{T2}$ is 0 GeV for subsystems $(\ell\ell)$ and $(b\ell\ell)$, while for subsystem
$(b\ell)$ 0 GeV and 80 GeV are imposed for the lepton side and the bottom side,
respectively. The dashed lines indicate the expected endpoints of the $t\bar{t}$
system.


Finally, taking the $m_{b\ell}$ distribution (the lower-right panel of Figure 4) into
consideration, we see very similar behaviors for both $tW$ (red solid histogram) and
$t\bar{t}$ (blue dashed histogram). Here, since there exists a two-fold combinatorial
ambiguity [27], we keep only the smaller of the two to ensure the boundedness of the
$m_{b\ell}$ distributions. For both of them, the kinematic endpoint is dictated by
the correct combination, i.e., the invariant mass formed by the $b$ and $\ell$
belonging to the decay cascade initiated by the same top quark, so that the expected
maximum $m_{b\ell}$ should be identical, that is,

$$m_{b\ell}^{\max} = \sqrt{m_t^2 - m_W^2}, \qquad (7)$$

where all final state particles, i.e., bottom jet, lepton, and neutrino, are assumed
massless. Again, the theoretical endpoint is indicated by a black dashed line, while
the actual distributions involve a small overflow that mostly stems from off-shell
effects and from events where an ISR jet is mis-tagged as a bottom quark-initiated
jet.$^4$ In addition to the correct combinations, even the ensembles of
incorrectly-combined $m_{b\ell}$ values are anticipated to be similar to each other
because the lepton in the wrong combinatorial side is emitted from the common
particle species $W$ for both $tW$ and $t\bar{t}$. Of course, there may be a
difference between the $W$ from the decay of a top quark and the $W$ produced in
association with a top quark. Our simulation result shown in the lower-right panel of
Figure 4, however, suggests that such a difference is insignificant.$^5$ All these
observations confirm that $m_{b\ell}$ as well is not an ideal kinematic variable for
discriminating signal events from background ones.
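The treatment of the two-fold ambiguity in $m_{b\ell}$ and the endpoint in Eq. (7) can be made concrete with a short Python sketch. The function names are hypothetical, four-vectors are assumed to be given as `(E, px, py, pz)` in GeV, and the endpoint function assumes the standard massless-final-state expression $\sqrt{m_t^2 - m_W^2}$ with the 173 GeV input top mass and an 80 GeV $W$ mass as illustrative defaults.

```python
import math

def invariant_mass(p1, p2):
    """Invariant mass of two four-vectors given as (E, px, py, pz) in GeV."""
    e = p1[0] + p2[0]
    px, py, pz = p1[1] + p2[1], p1[2] + p2[2], p1[3] + p2[3]
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def mbl_smaller(b_jet, lepton1, lepton2):
    """Smaller of the two possible b-lepton invariant masses.

    With one b-jet and two leptons there are two pairings; keeping the
    smaller value guarantees that the result does not exceed the value of
    the correct pairing.
    """
    return min(invariant_mass(b_jet, lepton1), invariant_mass(b_jet, lepton2))

def mbl_endpoint(m_top=173.0, m_w=80.0):
    """Endpoint of the correctly paired m_bl for massless b, lepton, neutrino."""
    return math.sqrt(m_top**2 - m_w**2)
```

With these default masses the endpoint evaluates to roughly 153 GeV, which is the scale at which a cut on $m_{b\ell}$ would be placed.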

The poor efficiency in separating $tW$ and $t\bar{t}$ events with a few simple
kinematic variables motivates the use of a more sophisticated method. As a matter of
fact, the ATLAS and CMS collaborations have made use of Boosted Decision Trees
(BDT) [30] for the purpose of rejecting more background events while retaining more
signal events. The BDT is a type of multivariate analysis (MVA), which is a category
of analysis methods that combine multiple input variables into a single discriminant.
A BDT takes a number of input variables (chosen by the analyst) and trains a certain
number of decision trees to separate the signal and background based on Monte Carlo
samples for each (for both CMS and ATLAS, it was $tW$ vs. $t\bar{t}$, and other
backgrounds were not included). To improve signal acceptance and background rejection
with reliable performance, the relevant machine-training is “boosted” by giving a
special weight to the cases where signal events are identified as background events
and vice versa. It has served the purpose of signal-background separation very well
in the context of the $tW$ discovery. However, it is rather difficult to find
variables yielding the best sensitivity so as to discriminate $tW$ from backgrounds
event-by-event. In addition, the eventual performance highly depends on the training
samples, so that the internal procedure is rather obscure.
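The reweighting idea behind boosting can be illustrated with a toy example. The Python sketch below is not the ATLAS or CMS implementation: it replaces full decision trees by single-cut classifiers on one input variable and uses an AdaBoost-style weight update, and all function names are hypothetical. Labels are assumed to be +1 for signal and -1 for background.

```python
import math

def stump_predict(x, cut, sign):
    """Single-cut classifier: returns `sign` above the cut and `-sign` otherwise."""
    return sign if x > cut else -sign

def train_stump(xs, ys, weights):
    """Find the cut and orientation with the smallest weighted error."""
    best = None
    for cut in sorted(set(xs)):
        for sign in (+1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(x, cut, sign) != y)
            if best is None or err < best[0]:
                best = (err, cut, sign)
    return best

def boost(xs, ys, n_rounds=5):
    """Train a weighted sequence of single-cut classifiers."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        err, cut, sign = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, cut, sign))
        # Misclassified events (y * prediction = -1) receive larger weights
        weights = [w * math.exp(-alpha * y * stump_predict(x, cut, sign))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def discriminant(ensemble, x):
    """Combined output: positive values are signal-like."""
    return sum(alpha * stump_predict(x, cut, sign)
               for alpha, cut, sign in ensemble)
```

The dependence of `ensemble` on the training sample is explicit here: every cut and every weight is derived from the events passed to `boost`.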


IV. tW WITH INITIAL STATE RADIATION:
AN ALTERNATIVE STRATEGY

$^4$ In principle, the NLO corrections may affect the kinematic distributions
including $m_{b\ell}$ [28]. However, we expect that the associated effect is not
significant, for example, based on the comparison of some kinematic distributions of
$tW$ at LO and NLO in Ref. [29].

$^5$ We also produced the invariant mass distributions in the larger of the two
combinations, and found that $t\bar{t}$ and $tW$ give rise to almost the same
spectra.

Motivated by the challenging situation in separating the signal events from the
background ones using simple kinematic variables, we propose an alternative kinematic
variable-based strategy of enhancing the relevant signal-over-background. The basic
idea behind it is to consider a higher order contribution, that is, a simple
attachment of an extra jet (see the right panel of Figure 1 as an example event
topology). The additional jet can be either mis-tagged as a bottom-initiated jet or
not, and we consider both cases separately later on. Hence, we define a couple of
signal regions whose final states are characterized by two opposite-sign leptons, a
large missing energy, and two (one) $b$-tagged and zero (one) ordinary jets:

Signal region I (SR-I): $pp \to 2b + \ell^+\ell^- + \not\!\!E_T$, (8)

Signal region II (SR-II): $pp \to 1b + 1j + \ell^+\ell^- + \not\!\!E_T$. (9)

We particularly emphasize that the discriminating power of $M_{T2}$ and $m_{b\ell}$
can be dramatically improved for $tW$ with an extra jet. For the background process
(i.e., $t\bar{t}$), the requirement of SR-I simply retrieves the entire dileptonic
decay topology of top pairs so that the associated decay topology is totally
well-defined. For SR-II, even if the extra jet is not $b$-tagged, we expect that the
relevant final state of the background comes mostly from the dileptonic top pairs,
i.e., the associated decay topology is as well-defined as that of SR-I. On the
contrary, for the signal process, an extra jet is typically emitted as initial or
final state radiation, and thus the relevant event topology is ill-defined. The main
idea behind the proposed strategy is to exploit this difference. Basically, the
distributions of $t\bar{t}$ events in $M_{T2}$ of the six subsystems and in
$m_{b\ell}$ are bounded above, and their upper bounds (i.e., kinematic endpoints) can
be easily calculated as in the previous section. On the other hand, for $tW$, the
extra jet coming from ISR can be arbitrarily hard so that the corresponding endpoints
in the $M_{T2}$ and $m_{b\ell}$ distributions are completely dictated by the hardness
of such an additional jet.

A. Signal region I: $pp \to 2b + \ell^+\ell^- + \not\!\!E_T$

We begin with the discussion for signal region I, followed by that for signal region
II in the next subsection. The event selection scheme for SR-I is the same as
Eqs. (3) through (6) with an additional $b$-tagged jet. Although we now require an
additional jet in the final state, the $M_{T2}$ of the $(\ell\ell)$ subsystem is not
substantially affected. The extra jet can be absorbed into the upstream momentum with
respect to the $(\ell\ell)$ subsystem, i.e., it is simply a redefinition of the
ensemble of the upstream momentum that exists in the $(\ell\ell)$ subsystem $M_{T2}$
for the leading order case. On the other hand, the $M_{T2}$ distributions for the
other five subsystems show a significant difference between $t\bar{t}$ and $tW$. We
first exhibit the unit-normalized $M_{T2}$ distributions of symmetric subsystems in
Figure 5: the $(b\ell b\ell)$ subsystem in the upper-left panel and the $(bb)$
subsystem in the lower-left panel. The blue dashed and the red solid histograms
correspond to the $t\bar{t}$ and $tW$ systems, respectively. Here the test masses are
chosen to be 0 GeV and 80 GeV for the $(b\ell b\ell)$ and the $(bb)$ subsystems,
correspondingly, while the black dashed lines denote the theory predictions for the
$M_{T2}$ endpoints of the $t\bar{t}$ system. The well-known two-fold ambiguity
arising in the $(b\ell b\ell)$ subsystem is treated by taking the smaller of the two
possible $M_{T2}$ values. We clearly see that most of the $t\bar{t}$ events are
confined below the expected kinematic endpoint, whereas a large fraction of the $tW$
events exceed the kinematic endpoints for the $t\bar{t}$ system. Therefore, if one
sets the cut near the kinematic endpoint, i.e., keeps only the events whose $M_{T2}$
value is greater than the cut, one can reject most of the background events while
retaining many signal events.

FIG. 5: $M_{T2}$ distributions of $t\bar{t}$ and $tW$ events for the $(b\ell b\ell)$
(upper-left panel) and $(bb)$ (lower-left panel) subsystems and their corresponding
selection efficiencies (right panels) in SR-I. The distributions are plotted with the
events passing the selection criteria in Eqs. (3) through (6) with one more
$b$-tagged jet required. The relevant combinatorics arising in the $(b\ell b\ell)$
subsystem is treated by choosing the smaller of the two possible $M_{T2}$ values. The
test mass for the $(b\ell b\ell)$ subsystem is 0 GeV, while that for the $(bb)$
subsystem is 80 GeV. The dashed lines indicate the expected endpoints of the
$t\bar{t}$ system.

Given this way of keeping or rejecting events with respect to a fixed $M_{T2}$ cut,
the associated efficiencies can be defined as the ratio of the number of events
passing the cut to the total number of events:

$$\epsilon^{t\bar{t}/tW} \equiv \frac{N^{t\bar{t}/tW}\ (\text{after } M_{T2} \text{ cut})}{N^{t\bar{t}/tW}\ (\text{before } M_{T2} \text{ cut})}. \qquad (10)$$

Note that efficiencies with the invariant mass $m_{b\ell}$ will be defined in a
similar fashion. The right panels of Figure 5 show the associated efficiency curves
for $t\bar{t}$ and $tW$ as functions of the $M_{T2}$ cut. They clearly show that the
signal efficiency $\epsilon^{tW}$ (red solid curves) greatly exceeds the background
efficiency $\epsilon^{t\bar{t}}$ (blue dashed curves) as the cuts approach or go
beyond the $t\bar{t}$ kinematic endpoints (black dashed lines).
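The efficiency defined above is a simple counting ratio, which the following Python sketch makes explicit. The function names are hypothetical, and events are assumed to be kept when their variable value lies above the cut, as described in the text.

```python
def cut_efficiency(values, cut):
    """Fraction of events whose variable value exceeds `cut`.

    `values` holds one M_T2 (or m_bl) value per event that passed the
    baseline selection, i.e. the events counted 'before the cut'.
    """
    if not values:
        raise ValueError("need at least one event")
    return sum(1 for v in values if v > cut) / len(values)

def efficiency_curve(values, cuts):
    """Efficiency as a function of the cut position, as (cut, efficiency) pairs."""
    return [(cut, cut_efficiency(values, cut)) for cut in cuts]
```

Calling `efficiency_curve` once with the $t\bar{t}$ sample and once with the $tW$ sample, over the same list of cut positions, would give the two curves to be compared.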

FIG. 6: $M_{T2}$ distributions of $t\bar t$ and $tW$ events for the $(b\ell b)$ (upper-left
panel), $(b\ell\ell)$ (middle-left panel), and $(b\ell)$ (lower-left panel) subsystems and
their corresponding selection efficiencies (right panels) in SR-I. The distributions are
plotted with the events passing the selection criteria in Eqs. (3) through (6), with one more
$b$-tagged jet required. The relevant combinatorics arising in all subsystems is treated by
the scheme in the text and Table I. The test mass for the decay side involving a lepton (only
a bottom quark) is 0 GeV (80 GeV). The dashed lines indicate the expected endpoints of the
$t\bar t$ system.

A similar analysis can be conducted for the three asymmetric subsystems exhibited in
Figure 6: the $(b\ell b)$ subsystem in the top panels, the $(b\ell\ell)$ subsystem in the
middle panels, and the $(b\ell)$ subsystem in the bottom panels. The $M_{T2}$ distributions are
shown in the left panels, while the corresponding efficiency plots are shown in the right
panels. Since the invisible particles in the $(b\ell b)$ and $(b\ell)$ subsystems differ
between the two decay legs, the relevant test masses are applied accordingly, i.e., 0 GeV for
the decay leg involving a lepton and 80 GeV for the decay leg involving only a bottom. On the
other hand, the $(b\ell\ell)$ subsystem assumes identical invisible particles (here neutrinos),
so that a common test mass of 0 GeV is employed. Note that a combinatorial issue arises for
all asymmetric subsystems. For any given event, there are two partitionings depending on the
way of grouping one lepton and one bottom quark, and for each partitioning two $M_{T2}$ values
are available. To resolve this combinatorial ambiguity, we follow the prescription used
in Ref. [31] with a slight modification, summarized as follows. As mentioned above, each
partitioning has two $M_{T2}$ values, a smaller and a larger one. Suppose that for one
partitioning we have the smaller value $a$ and the larger value $A$, while for the other
partitioning we have the smaller value $b$ and the larger value $B$. When ordering those four
values, we have six possibilities. As one of the partitionings is correct, either $A$ or $B$
is surely correct. However, we do not know a priori which is the case. Here we simply choose
the smaller of $A$ and $B$ as a conservative approach. For $tW$ with an extra jet, this
prescription is subtle because the relevant kinematic endpoint can be arbitrarily high, as
explained before. But we apply this selection scheme to every single event as if it belonged
to the dileptonic $t\bar t$. The orderings and the selection rule are tabulated in Table I. In
principle, this selection scheme is not unique, and other possibilities are available (see,
for example, Ref. [27], which also investigated efficiencies and purities by varying the
invariant mass and $M_{T2}$ cuts). We attempted other possible selection schemes and found
that the above-described prescription is the best for signal-background separation.

Ordering    bBaA   aAbB   baBA   baAB   abBA   abAB
Selection   B      A      B      A      B      A

TABLE I: Six possible orderings in $m_{b\ell}$ and $M_{T2}$ of the three asymmetric subsystems
and the selection scheme in each ordering. For each ordering, the left-to-right sequence is
from the lowest value to the highest. Out of the four values, only the values in the second
row are plotted in the relevant distributions.
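The selection rule can be written in a few lines. The sketch below is only an illustration of the prescription as stated in the text (take the larger value within each partitioning, then keep the smaller of the two); the function names are hypothetical. It checks that the rule reproduces the second row of Table I for all six orderings.

```python
def select_value(partition_1, partition_2):
    """Pick the plotted value from two partitionings.

    Each argument holds the two values computed for one partitioning.
    The larger value of each partitioning is A (or B); the smaller of
    A and B is returned as the conservative choice.
    """
    larger_1 = max(partition_1)  # "A" in the notation of the text
    larger_2 = max(partition_2)  # "B" in the notation of the text
    return min(larger_1, larger_2)


def selected_label(ordering):
    """Map an ordering string such as 'bBaA' (lowest to highest) to the chosen label."""
    rank = {label: position for position, label in enumerate(ordering)}
    chosen_rank = select_value((rank["a"], rank["A"]), (rank["b"], rank["B"]))
    return ordering[chosen_rank]


orderings = ["bBaA", "aAbB", "baBA", "baAB", "abBA", "abAB"]
selections = [selected_label(o) for o in orderings]
print(dict(zip(orderings, selections)))
```

Only the relative order of the four values matters for the choice, which is why integer ranks are sufficient for the check.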

Producing the distributions in Figure 6 according to this prescription, we observe a clear
separation between the signal and background events in all three subsystems. Most of the
background events populate the region below the expected kinematic endpoint for $t\bar t$,
while a large number of signal events are found even beyond the endpoint. Again, if the cut is
applied near the kinematic endpoint, most of the background events can be suppressed while
many signal events are kept. This expectation is consistently supported by the associated
efficiency curves in the $M_{T2}$ cuts. As in the symmetric subsystems, they show that the
signal efficiency (red solid curves) dominates over the background efficiency (blue dashed
curves) when the cuts are near or beyond the $t\bar t$ kinematic endpoints (black dashed
lines).

It is interesting to understand this overflow phenomenon of the signal in the $M_{T2}$
distributions of the various subsystems by investigating its asymptotic behavior in the
presence of a very hard $b$-jet, which typically emerges from a mis-tag of an ISR jet. By the
definition of $M_{T2}$ given in Eq. (2), it is sufficient to evaluate the global minimum of the
transverse mass for the decay side having such a hard $b$-jet, assuming, solely for
convenience, that it is the first decay side:

$$\left(M_T^{(1)}\right)^2 = \left(m_{\rm vis}^{(1)}\right)^2 + \tilde m_1^2 + 2\left(E_T^{{\rm vis}(1)} E_T^{{\rm inv}(1)} - \vec p_T^{\,{\rm vis}(1)} \cdot \vec p_T^{\,{\rm inv}(1)}\right),$$

where $E_T^{{\rm vis}(1)}$ and $m_{\rm vis}^{(1)}$ are the transverse energy and transverse mass
formed by all visible particles belonging to the first decay side, and $E_T^{{\rm inv}(1)}$,
$\vec p_T^{\,{\rm inv}(1)}$, and $\tilde m_1$ denote the transverse energy, transverse momentum,
and test mass of the invisible particle on that side. One can then prove that the global
minimum of the above transverse mass is given by

$$\left(M_T^{(1)}\right)_{\min} = m_{\rm vis}^{(1)} + \tilde m_1, \tag{11}$$

where $m_{\rm vis}^{(1)}$ simply implies the invariant mass formed by the relevant visible
particles [-■12, oil]. More specifically, if $m_{\rm vis}^{(1)}$ is formed by a bottom and a
lepton, it is evaluated by

$$\left(m_{\rm vis}^{(1)}\right)^2 = 2 E_b E_\ell \left(1 - \cos\theta_{b\ell}\right), \tag{12}$$

where $\theta_{b\ell}$ denotes the opening angle between $b$ and $\ell$. One can easily see that
it can be arbitrarily large as the bottom becomes arbitrarily hard, unless $b$ and $\ell$ are
extremely collinear. Thus, Eq. (11) can be arbitrarily large and, in turn, so can $M_{T2}$.
This argument is readily applicable to the subsystems where at least one of the decay sides
involves a lepton and a bottom at the same time: for example, subsystems $(b\ell b\ell)$,
$(b\ell b)$, and $(b\ell\ell)$.
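Both statements can be checked numerically. The sketch below is a toy illustration, not the procedure used for the plots in this paper: it assumes the standard single-side transverse mass with a trial invisible transverse momentum `q`, assumes massless $b$ and $\ell$ in the helper `m_bl`, and uses hypothetical input values in GeV. It verifies that the transverse mass never drops below the visible mass plus the test mass, and that the bottom-lepton invariant mass grows with the energy of the bottom.

```python
import math
import random


def transverse_mass(m_vis, p_vis, m_test, q):
    """Single-side transverse mass for a trial invisible transverse momentum q.

    m_vis, p_vis: mass and transverse momentum (px, py) of the visible system.
    m_test, q:    test mass and trial transverse momentum (qx, qy) of the invisible particle.
    """
    et_vis = math.sqrt(m_vis**2 + p_vis[0]**2 + p_vis[1]**2)
    et_inv = math.sqrt(m_test**2 + q[0]**2 + q[1]**2)
    dot = p_vis[0] * q[0] + p_vis[1] * q[1]
    mt_sq = m_vis**2 + m_test**2 + 2.0 * (et_vis * et_inv - dot)
    return math.sqrt(max(mt_sq, 0.0))  # guard against tiny negative rounding


def m_bl(e_b, e_l, cos_theta):
    """Invariant mass of a massless bottom and a massless lepton."""
    return math.sqrt(2.0 * e_b * e_l * (1.0 - cos_theta))


# Hypothetical visible system and test mass (GeV).
m_vis, p_vis, m_test = 150.0, (200.0, -50.0), 80.0

# The minimum is reached when q is parallel to p_vis with |q| = (m_test / m_vis) |p_vis|.
q_star = (m_test / m_vis * p_vis[0], m_test / m_vis * p_vis[1])
mt_at_minimum = transverse_mass(m_vis, p_vis, m_test, q_star)

# Random trial momenta never go below the floor m_vis + m_test.
rng = random.Random(0)
trials = [
    transverse_mass(m_vis, p_vis, m_test,
                    (rng.uniform(-1000.0, 1000.0), rng.uniform(-1000.0, 1000.0)))
    for _ in range(2000)
]

print("M_T at q_star:", mt_at_minimum, " floor:", m_vis + m_test)
print("smallest M_T in random scan:", min(trials))
print("m_bl for E_b = 100, 400, 1600:", [m_bl(e, 50.0, 0.0) for e in (100.0, 400.0, 1600.0)])
```

Since the floor itself scales with the visible invariant mass, a harder mis-tagged jet pushes the whole allowed range of the transverse mass upward.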

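If $m_{\rm vis}^{(1)}$ vanishes, however, this argument gets subtle, and thus it is better to
look at the full expressions of both $M_T$'s:

$$\left(M_T^{(1)}\right)^2 = \tilde m_1^2 + 2\left(E_T^{{\rm vis}(1)} E_T^{{\rm inv}(1)} - \vec p_T^{\,{\rm vis}(1)} \cdot \vec p_T^{\,{\rm inv}(1)}\right), \tag{13}$$

$$\left(M_T^{(2)}\right)^2 = \left(m_{\rm vis}^{(2)}\right)^2 + \tilde m_2^2 + 2\left(E_T^{{\rm vis}(2)} E_T^{{\rm inv}(2)} - \vec p_T^{\,{\rm vis}(2)} \cdot \vec p_T^{\,{\rm inv}(2)}\right)$$
$$\approx \left(m_{\rm vis}^{(2)}\right)^2 + \tilde m_2^2 + 2\left(E_T^{{\rm vis}(2)} E_T^{{\rm inv}(2)} + \vec p_T^{\,{\rm vis}(2)} \cdot \left(\vec p_T^{\,{\rm inv}(1)} + \vec p_T^{\,{\rm vis}(1)}\right)\right), \tag{14}$$

where in the second line of Eq. (14) we used the assumption that $\vec p_T^{\,{\rm vis}(1)}$ is
hard enough to dominate over the total visible momentum, i.e.,
$\vec p_T^{\,{\rm vis}(1)} + \vec p_T^{\,{\rm vis}(2)} \approx \vec p_T^{\,{\rm vis}(1)}$, so that
$\vec p_T^{\,{\rm inv}(2)} \approx -\left(\vec p_T^{\,{\rm inv}(1)} + \vec p_T^{\,{\rm vis}(1)}\right)$.
To minimize Eq. (13), $\vec p_T^{\,{\rm inv}(1)}$ should be either zero or parallel to
$\vec p_T^{\,{\rm vis}(1)}$. But then Eq. (14) becomes very large unless
$\vec p_T^{\,{\rm vis}(2)}$ is anti-parallel to $\vec p_T^{\,{\rm vis}(1)}$. On the other hand,
to minimize Eq. (14), $\vec p_T^{\,{\rm inv}(1)}$ should be set to be anti-parallel to
$\vec p_T^{\,{\rm vis}(1)}$, which makes Eq. (13) very large. So, the solution is likely to
occur in a certain intermediate configuration. However, both Eqs. (13) and (14) rise quickly
as $\vec p_T^{\,{\rm inv}(1)}$ moves away from those extreme configurations, due to the
largeness of $\vec p_T^{\,{\rm vis}(1)}$, and therefore the final $M_{T2}$ value is very likely
to be large.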

A similar observation can be made for the $m_{b\ell}$ distribution using an analogous
argument. Again, the requirement of an additional jet on top of a bottom-tagged jet and two
opposite-sign leptons recovers the entire decay topology of the dileptonic $t\bar t$ system,
so that the invariant mass variable is bounded from above as in the case of Sec. III. On the
other hand, the additional jet, which is mis-tagged as a bottom quark in SR-I, can be
arbitrarily hard, and thus the relevant invariant mass evaluated with it can be arbitrarily
large, as explained in Eq. (12) and thereafter. We therefore expect that the $m_{b\ell}$
distribution for $t\bar t$ is bounded above, whereas that for $tW$ features a long tail
stretching even beyond the expected endpoint of the $t\bar t$ system. Obviously, a
combinatorial issue arises in forming the $m_{b\ell}$ distributions. For the treatment of wrong
combinations in $m_{b\ell}$, we again follow the prescription used in Ref. [31], as adopted for
the $M_{T2}$ variables in the asymmetric subsystems. With this selection scheme in mind, we
plot the $m_{b\ell}$ distributions for $t\bar t$ and $tW$ in Figure 7, where the signal and the
background distributions are described by the red solid and the blue dashed histograms,
respectively. As the selection scheme preserves the kinematic endpoint of the $m_{b\ell}$
distribution for the $t\bar t$ system (see also Eq. (7)), we denote this theoretical endpoint
by the black dashed line. We clearly see that for a large fraction of signal events, the
associated $m_{b\ell}$ value exceeds the kinematic endpoint, as expected. As with $M_{T2}$, if
one imposes an $m_{b\ell}$ cut near the $t\bar t$ kinematic endpoint, i.e., keeps the events
whose $m_{b\ell}$ value is greater than the cut, one can reject most of the background events
while retaining many signal events. The right panel of Figure 7 shows the associated
efficiency curves for $t\bar t$ and $tW$ as functions of the $m_{b\ell}$ cut. We again observe
that the signal efficiency (red solid curve) is higher than the background efficiency (blue
dashed curve) when the cut is close to or beyond the $t\bar t$ kinematic endpoint (black
dashed line).

To look at the signal-background separation of each variable more closely, we plot the
Receiver Operating Characteristic (ROC) curves in Figure 8. Its right panel magnifies the
region where the background rejections are large. The ROC curve showing the best performance
(i.e., large signal efficiency as well as large background rejection) is drawn in the
rightmost position, and the others follow in order of decreasing performance:
$M_{T2}(b\ell\ell)$, $m_{b\ell}$, $M_{T2}(b\ell b\ell)$, $M_{T2}(b\ell b)$, $M_{T2}(bb)$, and
$M_{T2}(b\ell)$. The diagonal line connecting (1,0) and (0,1) (black dotted line) is drawn for
reference. We omit the curve for the $(\ell\ell)$ subsystem because it is hardly beneficial in
selecting signal events against background ones. In other words, it lies below or close to the
above-mentioned diagonal line over the entire range. In Table II, we also tabulate the cuts
and signal efficiencies ($\epsilon^{tW}$) of four sample points for which the background events
are rejected at a rate of 99.9%, 99%, 90%, and 50%. The ROC curves suggest that four variables
provide nearly equal, best efficiencies, namely $m_{b\ell}$ and the $M_{T2}$ in subsystems
$(b\ell b\ell)$, $(b\ell b)$, and $(b\ell\ell)$: for example, 99.9% of background rejection vs.
$\sim$5% of signal acceptance, 99% of background rejection vs. $\sim$20% of signal acceptance,
and so on.
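A working point of this kind (a cut and the signal efficiency at a fixed background rejection) can be extracted from event lists as sketched below. This is a minimal illustration under stated assumptions: events are kept when the variable exceeds the cut, the cut is placed on an observed background value, and the function names and toy values (in GeV) are hypothetical rather than the samples behind Table II.

```python
import math


def cut_for_rejection(background, rejection):
    """Smallest observed background value that rejects at least `rejection` of the background.

    An event is kept when its value is strictly greater than the cut.
    """
    if not background:
        raise ValueError("need at least one background event")
    ordered = sorted(background)
    # round() guards against floating-point noise such as 0.1 * 3 = 0.30000000000000004
    n_reject = math.ceil(round(rejection * len(ordered), 9))
    if n_reject <= 0:
        return float("-inf")
    return ordered[n_reject - 1]


def roc_point(signal, background, rejection):
    """Return (cut, signal efficiency) at the requested background rejection."""
    cut = cut_for_rejection(background, rejection)
    efficiency = sum(1 for v in signal if v > cut) / len(signal)
    return cut, efficiency


# Hypothetical toy values in GeV, for illustration only.
background = [95.0, 120.0, 140.0, 150.0, 165.0, 171.0, 172.0, 180.0]
signal = [110.0, 150.0, 170.0, 185.0, 210.0, 260.0, 320.0, 400.0]

for rejection in (0.5, 0.75):
    print(rejection, roc_point(signal, background, rejection))
```

Scanning the rejection from 0 to 1 and collecting the returned efficiencies traces out one ROC curve per variable.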

As the above-mentioned four are the best variables, it is interesting to investigate the
correlation among them to see if there is any further improvement in the relevant
discriminating power. One could attempt various combinations among them. For example, Figure 9
shows the unit-normalized two-dimensional temperature plots of $M_{T2}(b\ell b\ell)$ vs.
$m_{b\ell}$ for the $t\bar t$ (left panel) and the $tW$ (right panel) events. Very roughly, we
observe that the two variables have a positive correlation, i.e., as $M_{T2}$ in the
$(b\ell b\ell)$ subsystem increases, $m_{b\ell}$ increases as well, and vice versa. In
particular, this trend is more manifest for signal events, partly because both values are
commonly dictated by the hardness of the additional jet. Hence, it is rather challenging to
get a dramatic improvement by introducing simple schemes such as the rejection of events whose
$M_{T2}(b\ell b\ell)$ and $m_{b\ell}$ values are simultaneously less than the given respective
cuts. We instead see that the background events tend to populate a local region (lower-left
corner in the figure), while the signal events spread over a (relatively) wider region. Given
this observation, a potential improvement could be achieved by introducing a customized cut
enveloping the background region in the left panel of Figure 9. We do not perform a detailed
study in this direction because it is beyond the scope of this paper.

B. Signal region II: $1b + 1j + \ell^+\ell^- + \not\!\!E_T$

The same strategy is readily available for Signal Region II. Event selection is done with
Eqs. (3) through (5) but with a slight modification of Eq. (6) as follows:

$$N_j = 1 \ \text{while}\ N_b = 1, \quad p_T^j > 30\ \text{GeV}, \quad |\eta^j| < 2.4. \tag{15}$$

Once the jet is selected in this way, it is considered as another $b$-jet throughout the
analysis later on. To preclude the inclusion of any extra loose jet, we additionally require
that there should be only one jet even satisfying $p_T^j > 20$ GeV and $|\eta^j| < 4.9$.
Although most events come from either $tW$ or $t\bar t$, SR-II is contrasted with SR-I by a
couple of qualitative differences. First, an enhanced signal-over-background is anticipated.
Since the additional jet typically originates from ISR/FSR gluons, more $tW + j$ events can
pass the relevant selection criteria than those in SR-I. On the contrary, the ordinary
dileptonic $t\bar t$ comes with two bottom quarks at the parton level, so that the requirement
of a single regular jet and a single bottom jet reduces the background acceptance by the rate
at which a bottom quark fails to be tagged. Second, at the expense of gaining more signal
acceptance, the signal separation from the background events becomes less efficient. The
reason is that for $t\bar t$ there is more possibility that such an extra jet is from ISR,
which would have been rejected by an additional $b$-tagged jet. Like the signal process
$tW + j$, the hard ISR jet can make even $t\bar t$ events exceed the expected kinematic
endpoint, and as a result, the signal efficiency becomes (slightly) reduced for a given
background rejection.

FIG. 7: Invariant mass distribution (left panel) and $m_{b\ell}$ selection efficiency (right
panel) of $t\bar t$ and $tW$ events in SR-I. The distributions are plotted with the events
passing the selection criteria in Eqs. (3) through (6), with one more $b$-tagged jet required.
The relevant combinatorics is treated by the prescription explained in the text and Table I.
The dashed lines indicate the expected endpoints of the $t\bar t$ system.

[Figure 8, both panels titled: $2\ell + 2b$ channel]

FIG. 8: ROC curves (left panel) for the $M_{T2}$ and $m_{b\ell}$ variables and their
magnification for the regime having a large background rejection (right panel) in signal
region I.

Figure 10 shows the $M_{T2}$ distributions of $t\bar t$ and $tW$ events for the $(b\ell b\ell)$
(upper-left panel), $(bb)$ (upper-right panel), $(b\ell b)$ (middle-left panel), $(b\ell\ell)$
(middle-right panel), and $(b\ell)$ (lower-left panel) subsystems and the $m_{b\ell}$
distribution (lower-right panel). We produce those distributions using the events satisfying
the selection criteria given in Eqs. (3)-(5) and (15). The combinatorial ambiguity arising in
all variables but the $M_{T2}$ for the $(bb)$ subsystem is taken care of by the same
prescriptions elaborated in the previous subsection. The employed test masses are the same as
the ones used in the corresponding $M_{T2}$ variables in SR-I. As before, the black dashed
lines indicate the expected endpoints of the $t\bar t$ system. We observe that all
distributions look very similar to the corresponding ones shown in Figures 5, 6, and 7.
However, we also observe that more background events leak beyond the associated kinematic
endpoints, as discussed before. To see the correlation of signal acceptance vs. background
rejection, we plot the ROC curves in Figure 11. As in SR-I, its right panel zooms in on the
region where the background rejections are large. The color code is
the same as that in Figure 8. More quantitatively, we enumerate the cuts and signal
efficiencies of four sample points in Table III, as in Table II. The signal acceptance is
somewhat worse than that in SR-I for large background rejection, but it improves relative to
SR-I as the background rejection decreases.

$1 - \epsilon^{t\bar t}$   $M_{T2}(b\ell b\ell)$   $M_{T2}(bb)$   $M_{T2}(b\ell b)$   $M_{T2}(b\ell\ell)$   $M_{T2}(b\ell)$   $m_{b\ell}$
0.999                      258 (0.056)             203 (0.036)    258 (0.052)         253 (0.050)           171 (0.024)       253 (0.049)
0.99                       191 (0.192)             181 (0.078)    192 (0.182)         170 (0.206)           147 (0.060)       168 (0.203)
0.90                       164 (0.332)             159 (0.169)    167 (0.311)         143 (0.351)           125 (0.159)       140 (0.350)
0.50                       136 (0.601)             124 (0.522)    141 (0.579)         116 (0.623)           103 (0.475)       111 (0.626)

TABLE II: Signal efficiency (numbers in parentheses) and the associated cuts in GeV for
$m_{b\ell}$ and $M_{T2}$ in various subsystems with respect to SR-I. The numbers are tabulated
for four representative background rejections, $1 - \epsilon^{t\bar t}$.

[Figure 9 panel titles: $2\ell + 2b$ channel, unit-normalized $t\bar t$ events (left); $2\ell + 2b$ channel, unit-normalized $tW$ events (right)]

FIG. 9: Correlation plots in $M_{T2}(b\ell b\ell)$ vs. $m_{b\ell}$ for $t\bar t$ (left panel)
and $tW$ (right panel). The vertical and horizontal dashed lines indicate the expected
endpoints of $M_{T2}(b\ell b\ell)$ and $m_{b\ell}$ for the $t\bar t$ system.


V. DISCUSSIONS AND OUTLOOK 

The top quark is the heaviest particle in the Standard Model and has the largest coupling to
the Higgs boson. It may open up a new window toward new physics, and therefore it is important
to understand its properties. Very recently, production of a top quark in association with a
$W$ boson has been observed by the ATLAS and CMS collaborations. Most kinematic properties of
the signal ($tW$) are very similar to those of $t\bar t$, which is the dominant background.
Multivariate analysis has been adopted to discover the production of $tW$ without a detailed
understanding of the kinematics of the signal and its backgrounds.

In this paper, we have re-examined the production of a single top quark and a $W$ gauge boson
in the Standard Model with a non-conventional strategy. Our suggestion is to consider $tW + j$
instead of $tW$, which also modifies the relevant backgrounds correspondingly. This
next-to-leading order production for $tW$ signifies the retrieval of the visible state of
ordinary $t\bar t$, the major background, under the assumption that such an additional jet
mostly comes from one of the bottom quarks in it. Clearly, the relevant kinematic structure of
the background is well-defined, so that the distributions in well-known kinematic variables
such as the invariant mass and $M_{T2}$ feature well-defined kinematic endpoints. This is
contrasted with the ill-defined kinematic structure for $tW + j$, due to the fact that $j$ is
typically from ISR/FSR. As a consequence, it was observed that for $tW + j$, the kinematic
endpoints of the aforementioned distributions are also ill-defined, i.e., the distributions
are not bounded above. Based on these observations, we found that one could suppress the
$t\bar t$ background very efficiently with those variables, while obtaining a high efficiency
in the signal. The simple use of kinematic variables could have helped achieve an earlier
discovery with a large significance in combination with conventional channels. Since this
method provides excellent background rejection, one could try to study other properties of
the top quark in this channel. We strongly encourage the ATLAS and CMS collaborations to
revisit their study on $tW$ with our suggestions. Moreover, searches for $B'$ (bottom partner)
in the $tW$ final state may exploit similar techniques.

FIG. 10: $M_{T2}$ distributions of $t\bar t$ and $tW$ events for the $(b\ell b\ell)$
(upper-left panel), $(bb)$ (upper-right panel), $(b\ell b)$ (middle-left panel), $(b\ell\ell)$
(middle-right panel), and $(b\ell)$ (lower-left panel) subsystems and the $m_{b\ell}$
distribution (lower-right panel) in SR-II. The distributions are plotted with the events
passing the selection criteria in Eqs. (3)-(5) and (15). The combinatorics arising in the
relevant variables is treated by the scheme in the text and Table I. The test mass for the
decay side involving a lepton (only a bottom quark) is 0 GeV (80 GeV). The dashed lines
indicate the expected endpoints of the $t\bar t$ system.


[Figure 11, both panels titled: $2\ell + 1b + 1j$ channel; axis label: signal efficiency]

FIG. 11: ROC curves (left panel) for the $M_{T2}$ and $m_{b\ell}$ variables and their
magnification for the regime having a large background rejection (right panel) in signal
region II.

$1 - \epsilon^{t\bar t}$   $M_{T2}(b\ell b\ell)$   $M_{T2}(bb)$   $M_{T2}(b\ell b)$   $M_{T2}(b\ell\ell)$   $M_{T2}(b\ell)$   $m_{b\ell}$
0.999                      480 (0.003)             278 (0.003)    473 (0.003)         451 (0.005)           217 (0.004)       451 (0.004)
0.99                       297 (0.042)             202 (0.036)    292 (0.042)         285 (0.044)           159 (0.035)       284 (0.044)
0.90                       174 (0.318)             162 (0.155)    175 (0.305)         152 (0.340)           126 (0.153)       148 (0.346)
0.50                       138 (0.617)             125 (0.513)    143 (0.587)         118 (0.635)           103 (0.485)       113 (0.635)

TABLE III: Signal efficiency (numbers in parentheses) and the associated cuts in GeV for
$m_{b\ell}$ and $M_{T2}$ in various subsystems with respect to SR-II. The numbers are
tabulated for four representative background rejections, $1 - \epsilon^{t\bar t}$.

We emphasize that our novel strategy is very general and can play a key role in separating
signal and background events even in the context of physics models beyond the Standard Model.
More specifically, the discussion in this paper is readily applicable to any processes that
resemble the following structure:

$$A\bar A \to (Bb)(\bar B\bar b) \to (Ccb)(\bar C\bar c\bar b), \tag{16}$$

$$A\bar B \to (Bb)(\bar B) \to (Ccb)(\bar C\bar c), \tag{17}$$

where the former represents pair-production of particle $A$
while the latter represents single-production of particle $A$ in association with particle
$B$. Here $A \to Bb$ ($\bar A \to \bar B\bar b$), $B \to Cc$ ($\bar B \to \bar C\bar c$), and
the bar denotes the anti-particle. In supersymmetric models, one can imagine the following
processes.

(1) $\tilde t\tilde t^*$ vs. $\tilde t\tilde\chi_1^-$ (or $\tilde t^*\tilde\chi_1^+$), where
$\tilde t \to \tilde\chi_1^+ b \to b\ell^+\tilde\nu$ and similarly
$\tilde t^* \to \tilde\chi_1^- \bar b \to \bar b\ell^-\tilde\nu^*$

(2) $\tilde g\tilde g$ vs. $\tilde g\tilde q$ (or $\tilde g\tilde q^*$), where
$\tilde g \to \tilde q\bar q \to q\bar q\tilde\chi_1^0$

The selection procedure targeting the full visible state of the former processes inevitably
demands an extra object for the latter ones, leading to an ill-defined event topology for the
latter ones only. The kinematic variable-based strategy proposed in this paper can then help
us separate the latter processes from the former ones.